
Automatic Annotation and Assessment of Syntactic Structures in Law Texts Combining Rule-Based and Statistical Methods

Abstract

In this thesis, I investigate and develop methods for automatically analyzing and assessing German syntactic structures in domain-specific texts. As domain-specific texts, I use Swiss German-language law texts.

The automatic annotation of syntactic structures has long been studied in natural language processing research. Supervised statistical methods are regarded as the state of the art in parsing: they are accurate, but biased toward the type of text they were trained on. Consequently, the accuracy of statistical parsers decreases when they are applied to domain-specific texts. This domain bias in syntactic annotation needs to be addressed whenever it directly affects the accuracy of an application. The syntactic assessment that I develop in this thesis is such an application: it requires highly accurate syntactic annotation. An effective solution would be to manually annotate a large portion of the required domain texts. However, this is not feasible in practice because manual linguistic annotation is extremely labor-intensive. To overcome this problem, I develop syntactic annotation methods that do not require the manual annotation of a large portion of the domain texts. The goal of this thesis is to make the annotation accuracy on domain-specific texts high enough for use in the application.

For the automatic syntactic assessment, I demonstrate a novel approach to modeling domain-specific style choice by combining rule-based and statistical methods. In the rule-based approach, I present a method that automatically detects violations of the style rules laid down in legislative style guidelines. In the statistical approach, domain-specific writing style is defined in terms of the stylistic choice between syntactic alternations. This syntactic selection is modeled statistically by classifying syntactic alternatives according to their syntactic complexity. The syntactic assessment requires automatic syntactic annotation.

For the automatic syntactic annotation, I present a linguistically motivated hybrid supertagger that analyzes topological dependency grammar relations in German. In this thesis, supertagging is treated as a problem of morphosyntactic ambiguity and its syntactic resolution. Depending on the linguistic phenomenon, the ambiguity is resolved by a rule-based or a statistical tagging method: morphological and syntactic hard constraints are applied in a constraint grammar approach, whereas lexical, semantic, and pragmatic soft and multivariate constraints are integrated into a conditional random fields model.

The main contribution of this thesis to the study of natural language processing is to show that a linguistically motivated annotation method is a viable approach to achieving high-performance syntactic analysis with only a few hundred manually annotated sentences from the domain.
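The division of labor between hard and soft constraints can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the tag set, the single removal rule, and all weights are hypothetical, and a hand-weighted Viterbi decoder stands in for a trained conditional random fields model. The sketch only shows the general pattern of pruning candidate tags with a categorical rule and then scoring what remains.

```python
from typing import Dict, List, Tuple

Candidates = List[List[str]]
Weights = Dict[Tuple[str, str], float]


def remove_pron_before_noun(cands: Candidates) -> Candidates:
    """Hard constraint (toy, hypothetical rule): drop the PRON reading of a
    DET/PRON-ambiguous token when the next token is unambiguously a NOUN."""
    out = [list(c) for c in cands]
    for i in range(len(out) - 1):
        if out[i + 1] == ["NOUN"] and "DET" in out[i]:
            out[i] = [t for t in out[i] if t != "PRON"]
    return out


def viterbi(words: List[str], cands: Candidates,
            emit: Weights, trans: Weights) -> List[str]:
    """Soft constraints: pick the highest-scoring tag path among the
    candidates that survived the hard constraints."""
    # best[i][tag] = (score of best path ending in tag, previous tag)
    best = [{t: (emit.get((words[0], t), 0.0), None) for t in cands[0]}]
    for i in range(1, len(words)):
        column = {}
        for t in cands[i]:
            prev, score = max(
                ((p, s + trans.get((p, t), 0.0))
                 for p, (s, _) in best[i - 1].items()),
                key=lambda x: x[1],
            )
            column[t] = (score + emit.get((words[i], t), 0.0), prev)
        best.append(column)
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]  # follow the backpointer
        path.append(tag)
    return path[::-1]


# Hypothetical example sentence, candidate tags, and weights.
WORDS = ["sie", "regeln", "die", "Einzelheiten"]
CANDS = [["PRON"], ["VERB", "NOUN"], ["DET", "PRON"], ["NOUN"]]
TRANS = {("PRON", "VERB"): 1.0, ("PRON", "NOUN"): -1.0,
         ("VERB", "DET"): 0.5, ("DET", "NOUN"): 1.0}
EMIT: Weights = {}

pruned = remove_pron_before_noun(CANDS)
tags = viterbi(WORDS, pruned, EMIT, TRANS)
print(tags)
```

In this toy setup the rule settles the DET/PRON ambiguity categorically, while the VERB/NOUN ambiguity is left to the weighted scoring. In a real system the weights would be learned from annotated sentences rather than set by hand.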

Bibliographic record

  • Author

    Sugisaki, Kyoko

  • Affiliation
  • Year 2016
  • Total pages
  • Original format PDF
  • Language eng
  • Chinese Library Classification
